CHARMM:The Basics

From CHARMM tutorial
Jump to: navigation, search

Contents

Introduction

What is CHARMM?

CHARMM (Chemistry at HARvard Macromolecular Mechanics) is a computational package used for classical, quantum and hybrid quantum/classical simulations. CHARMM is a complex program with over 500,000 lines of code and offers many analytical features. The program employs a command line interface and in practice most interaction with the program is done via user-created input scripts.

The original CHARMM reference and a brief update are:

CHARMM: A Program for Macromolecular Energy, Minimization, and Dynamics Calculations, J. Comp. Chem. 4, 187-217 (1983), by B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus.

CHARMM: The Energy Function and Its Parameterization with an Overview of the Program, in The Encyclopedia of Computational Chemistry, 1, 271-277, P. v. R. Schleyer et al., editors (John Wiley & Sons: Chichester, 1998), by A. D. MacKerell, Jr., B. Brooks,C. L. Brooks, III, L. Nilsson, B. Roux, Y. Won, and M. Karplus.

Up to date information about CHARMM may be found at www.charmm.org.

Who developed CHARMM?

CHARMM was created by the Martin Karplus Group at Harvard but developers from across the globe are continually working to improve CHARMM and add new features. A partial list of current CHARMM developers can be found here.

What are some of the features of CHARMM?

  • Robust Langevin and molecular dynamics simulation engine
  • Extensive trajectory analysis features.
  • Much of the code is parallelized for execution on SMP machines and clusters.
  • Multiscale modeling (e.g. QM/MM simulations).
  • Interfaces to various QM packages.
  • Embedded X11 graphics subsystem.
  • Facilities for normal mode and quasiharmonic analysis.
  • For more information see the CHARMM Documentation.

Executing CHARMM

CHARMM can be called with the command line as:

charmm_executable < charmm_input_script.inp > charmm_output_file.out

it should also be noted here that CHARMM can accept command line arguments and set them to CHARMM variables that can be used in input scripts. For example:

charmm_executable nsteps=100 < charmm_input_script.inp > charmm_output_file.out


Molecule Metadata

Molecular modeling software relies on having standard formats for reading molecular structures. One of the most common such formats is the Protein Data Bank (PDB) file. When starting a project, the structure of a molecule is often initially read into CHARMM from a PDB file, which contains information about the component atoms. Atoms are grouped into residues; for example, each amino acid is a residue. Residues are further grouped into chains, or segments, which represent major functional units of the protein. Within the PDB file, the type of each atom, its residue name (RESName) and residue identification number (RESId), and its segment identifier (SEGId) are all specified.

Once a structure is read into CHARMM, the software creates its own internal data structures to represent it. These are discussed in the next subsection.

CHARMM data structures

Residue Topology File (RTF) This file defines groups by including the atoms, the properties of the group, and bond and charge information. CHARMM has standard Residue Topology Files for nucleic acids, lipids, proteins and carbohydrates.

Parameter File (PARA or PARM) This file determines the energy associated with the structure by defining bond, angle and torsion force constants and van der Waals parameters. CHARMM has standard parameter files for nucleic acids, lipids, proteins and carbohydrates.

Coordinates (COOR) These are the standard Cartesian coordinates of the atoms in the system. These are typically read in or written out in PDB or CHARMM card (CRD) file format. The card format keeps track of additional molecule information that can be useful for manipulation (i.e. residue name, segment name, segment id, resdiue id, etc.). Below is an example of a .CRD file and the information in contains:

 title = * WATER
 title = *  DATE:     4/10/07      4:25:51      CREATED BY USER: USER
 title = *
 Number of atoms (NATOM)       = 6
 Atom number (ATOMNO)          = 1 (just an exmaple)
 Residue number (RESNO)        = 1
 Residue name (RESName)        = TIP3
 Atom type (TYPE)              = OH2
 Coordinate (X)                = -1.30910
 Coordinate (Y)                = -0.25601
 Coordinate (Z)                = -0.24045
 Segment ID (SEGID)            = W
 Residue ID (RESID)            = 1
 Atom weight (Weighting)       = 0.00000
 
 now what that looks like...
 
 * WATER
 *  DATE:     4/10/07      4:25:51      CREATED BY USER: USER
 *
     6
    1    1 TIP3 OH2   -1.30910  -0.25601  -0.24045 W    1      0.00000
    2    1 TIP3 H1    -1.85344   0.07163   0.52275 W    1      0.00000
    3    1 TIP3 H2    -1.70410   0.16529  -1.04499 W    1      0.00000
    4    2 TIP3 OH2    1.37293   0.05498   0.10603 W    2      0.00000
    5    2 TIP3 H1     1.65858  -0.85643   0.10318 W    2      0.00000
    6    2 TIP3 H2     0.40780  -0.02508  -0.02820 W    2      0.00000

Protein Structure File (PSF) The PSF holds lists of every bond, bond angle, torsion angle, and improper torsion angle as well as information needed to generate the hydrogen bonds and the non-bonded list. It is essential for the calculation of the energy of the system.

Internal Coordinates (IC) This data structure determines the internal coordinates for atoms and is used for analyzing the data. Internal coordinates represent the coordinates of one atom relative to another rather than a standard Cartesian plane.

Non-Bonded list (NBONds) This is a list of non-bonded atoms. It is used in calculating energy, dipole moments, and quadrapole moments. The non-bonded list does contain atoms that are in atom-to-atom contact and engaging in van der Waals interactions.

Constraints (CONS) Constraints fix atoms in exactly one position during the simulation. This information is stored internally in the IMOVe array.

Images Data Structures (IMAGe) This data structure is used to help create symmetrical structures and contains bond information. This is a general image support system that allows the simulation of almost any crystal and also finite point groups. There is also a facility to introduce bond linkages between the primary atoms and image atoms. This allows infinite polymers, such as DNA to be studied. For infinite systems, an asymmetric unit may be studied because rotations and reflections are allowed transformations.

Crystal Data Structures (CRYStal) The crystal module is an extension of the image facility within the CHARMM program that allows calculations on crystals to be performed. It is possible to build a crystal with any space group symmetry, to optimize its lattice parameters and molecular coordinates and to carry out a vibrational analysis using the options. All crystal commands are invoked by the keyword CRYStal.

Entering Commands into CHARMM

CHARMM is controlled via a command line interface, and can be used either interactively or with a pre-written script. The overwhelming majority of the time, you will use a script. A script is simply a collection of CHARMM commands that are usually executed in sequence (although looping and branching is possible, as will be discussed later on in this tutorial).

In this tutorial, literal CHARMM commands and their output will be either displayed in a fixed font, e.g. foo bar spam or set off as follows:

This text is offset.
It's in a fixed font too!

Except when providing options to the DYNAmics command (discussed later) you only need to spell out the first four letters of each command or option. Furthermore, all CHARMM commands are case insensitive. For example, to run the minimization command you can type any of MINI, mini, minimize, Minimization, or mInImIz (although the last one is kind of silly).

Basic CHARMM Commands

CHARMM commands may be entered via standard input or (usually) via an input file. CHARMM in most cases only reads the first four letters of a command and commands are not case-sensitive. For example, "GENERATE", "generate", "GENE", and "gene" all execute the same command. Occasionally, (e.g., some options to the DYNAmics command) six letters are required.

However, case-sensitivity is an issue when filenames are concerned since, usually, the underlying file system is case-sensitive, i.e., /home/user/Charmm/file.psf and /home/user/charmm/file.psf are not equivalent. By default, CHARMM first converts a filename in all uppercase letters and then tries all lowercase letters. Thus, if you specified a file as /home/user/Charmm/file.psf CHARMM would look for /home/user/charmm/file.psf and /HOME/USER/CHARMM/FILE.PSF and fail to find the intended file. If you have files (pathnames) containing both lower and upper case characters, you can work around this default behavior by enclosing the path in quotation marks, e.g. by specifying "/home/user/Charmm/file.psf".

Each CHARMM script must begin with a title, which is an arbitrary number of lines beginning with asterisks (*) and containing whatever text the user would like to identify the script. The end of a title is specified by a line with only a single asterisk on it. An example title is:

* This is my CHARMM script
*

A CHARMM script should end with the STOP command.

CHARMM supports basic logical and mathematical operations within input scrips and can parse IF, GOTO, LABEL, STREam, RETUrn, SET, INCRement, and DECRement statements.

The IF statement works like any other if statement in programming and it has the following test cases:


.GT. greater than
.LE. less than
.GE. greater than or equal to
.LE. less than or equal to
.EQ. equal to
.NE. not equal to
.AE. almost equal (diff<0.0001)

Example:

IF phi .EQ. 1.618
     SET pi = 3.14

For the GOTO command a LABEL must be established first and if there is more than one label with the same string then GOTO goes to the first label. This is typically used to create loops in CHARMM scripts.

Example:

SET i = 1
LABEL LOOP

do something...
INCRement i by 1
IF @i .LT. 10 THEN GOTO LOOP

STREAM allows another file to be directly imported to the input file. This can help split up one large input file to keep the input script clean and allow the user to import commonly used commands with a single line. This is analogous to reusable subroutines in programing. A file that will be streamed in must have a title and rather than a "stop" command at the end of the input file, there should be a "return" command. An example is given below.

 *File to be streamed in. File name is stream_example.str
 *
 
 OPEN UNIT 1 READ CARD NAME /path/to/coordinate_file.crd
 READ COOR UNIT 1 CARD
 CLOSE UNIT 1
 
 RETURN
*Input file that streams in a file
*
 
 STREam stream_example.str

The SET command for CHARMM is a way to assign numbers to variables. DEFIne is used for strings.

Example:

SET phi = 1.618

When a value is set, it will stay the same value even if used in another command until the user manually changes the value.

CALC is used for calculations. CHARMM requires spaces between numbers, operations, and parenthesis

Example:

calc pi = 2.14 + 1

Result: Variable "pi" is set to 3.14

DEFIne may be used as:

DEFI bigprotein SELEct SEGID A END

In this situation, bigprotein will hold the atom selection. the word "bigprotein" can be used as an alias for all the atoms that have the segment ID (SEGId) A.

There are many different options for the select command. More information on the SELEct command may be found in the CHARMM documentation (select.doc).

To continue lines in a CHARMM input script use a hyphen (-). To add comments to the CHARMM input script use an exclamation mark (!).

Example:

 ! Define the region
 DEFIne REGIon SELE ATOM A* 1 C1 OR. ATOM A* 1 C2 -
  .OR. ATOM A* 1 H1 .OR. ATOM A* 1 H2 -
  .OR. ATOM A* 1 H3 .OR. ATOM A* 1 H4 -
  .OR. ATOM A* 1 H5 SHOW END

The amount of information printed out by CHARMM during execution can be controlled with the WRNLev and PRNLev commands. Warnings are assigned levels between -5 and 5, with lower numbers indicating a higher severity. WRNLev sets the least severe warning that will be displayed to the user. default value is 5 (all warnings are printed). , non-warning output statements are assigned a priority from 0 to 10 and only those less than the current PRNLev are displayed. default value for PRNLev is 5.

The BOMBlev (or BOMLev) command is used to set the least severe warning that will cause CHARMM to abort execution. The default value is 0 and running with a BOMBlev less than -2 is generally not recommended because serious errors can affect the validity of calculations. All warnings at or beneath level 0 should be carefully scrutinized to make sure that they do not cause issues with the results.

Basic Input Scripts

The features and capabilities of CHARMM can all be used through input scripts. The CHARMMing web site, http://www.charmming.org, contains an integrated set of tools for uploading structures, performing simulations, and viewing the results. In order to provide the best possible user experience, it incorporates a number of freely available tools such as Jmol for visualization and an automatic residue topology file (RTF) generator (GENRTF) which generates the necessary information for atoms and residues that are currently not supported by the CHARMM force fields. Below is a partial list of functionality that currently is incorporated into CHARMMing:

  • PDB/CRD reader and input script generator
  • Integrated molecular graphics
  • Integrated simulation tools (i.e. minimization, solvation, dynamics)
  • Automatic topology generation

This tutorial will show how the CHARMM scripts that accomplish these tasks work.

Preparation: The PDB that will be used along with this tutorial has a PDB.org ID of 1YJP. Uploading this PDB to CHARMMing will separate it into two parts: one containing the protein and the other containing the waters ("good" HETATM file). If you use charmming org, these files are named new_1yjp_xxxxx-1-a.pdb and new_1yjp_xxxxx-1-a-goodhet.pdb. All files that are used in the scripts discussed below can also be downloaded from: http://www.charmmtutorial.org/static/files/tutorial-files.tar.gz. Within this tar.gz archive, the split pdb files of interest are named ./tutorial-files/pdb/1yjp_protein.pdb and ./tutorial_files/pdb/1yjp_waters.pdb. (Of course, the user can also split the original pdb manually.) Note: Because of a change to the pdb reader in version 35 of CHARMM, processing these files with c35b1 or higher may not work! See "Known Bugs" in "About charmming.org" at http://www.charmming.org (you have to log in to get to this link!)

First, the tutorial will cover dealing with the protein part of the PDB because waters must be treated differently. To start off the script, the title lines are needed.

* Title
*

The next step is to read in the files containing the residue topology (RTF) and parameters (PARAm). Either the user can supply their own topology and parameter files or use CHARMMs supplied files. Either way, the pathway to the files must be explicitly stated in the input script.

After the title, start off the line with "OPEN" and the next step is to assign a unit number to the "OPEN" command. This can be done by "UNIT" followed by a space and number. The unit number acts as a pointer to the file. This will be familiar to users who have done Fortran programming. Next, follow the unit number by "CARD" as this tells CHARMM that the file is to be interpreted as formatted ASCII text as opposed to binary data (in fact "FORM" or FORMatted can be used as a synonym for "CARD"). For example, when reading coordinates from a formatted .crd file, CHARMM knows that the X coordinates are in column 5. It is important to realize that columns are defined by character position instead of white space, and therefore it is imperative that files be formatted correctly. Tell CHARMM to read the file by using the command "READ NAME" followed by the path of the topology file. The whole line should now look like this:

OPEN UNIT 1 CARD READ NAME /charmm/c34b2/toppar/top_all27_prot_na.rtf

Now that CHARMM has opened a file and assigned a pointer to it (UNIT 1), it is time to to read the residue topology information. Use the "READ" command with the RTF keyword to tell CHARMM to read data from the file. The "CARD" keyword and "UNIT" number should again be provided. The complete command should be:

READ RTF CARD UNIT 1

CHARMM now reads the data in from the file, and will print out the title of the file it read (assuming there is a title) onto standard output. Since a new file was recently opened, it is now time to close the file and the unit number can be reused. To do this, use a "CLOSE UNIT 1" statement. The RTF importation should now look like this:

*CHARMM Tutorial
*

!Read in Topology and Parameter files
OPEN UNIT 1 CARD READ NAME /charmm/c32b2/toppar/top_all27_prot_na.rtf
READ RTF CARD UNIT 1
CLOSE UNIT 1

(Note: An exclamation mark (!) tells CHARMM to disregard any other character to its right. Exclamation marks are used to indicate a comment)

For the parameter file, do the same thing except change RTF to PARA and reference the correct parameter file. The line should now read:

OPEN UNIT 1 CARD READ NAME /charmm/c34b2/toppar/par_all27_prot_na.prm
READ PARA CARD UNIT 1

Once the parameter and topology files have been read in by CHARMM, the coordinate file must be read in. PDB files (typical starting point) contain the sequence of residues (the amino acids making up the protein) as well as the coordinates of the heavy atoms (the coordinates of the hydrogens are generally not available). Opening the file follows the same procedures as the RTF and Parameter files; however, reading in the files from a .pdb is slightly different. The PDB must be read twice, the first time for the sequence and the second time for the coordinates. To do the first half of this, use the "SEQU" (short for sequence) after the "READ" command. CHARMM has no way of knowing the user is reading in a PDB file without the user explicitly saying so; therefore, follow "SEQU" with "PDB". The line should now look like this:

READ SEQU PDB UNIT 1

Now that the sequence has been read in the protein structure file (PSF) has to be generated. The "GENErate" command builds this data structure from the topology information and sequence. It constructs the internal coordinate (IC) table. There must be a label (segment ID) CHARMM can associate with this protein and that follows the "GENErate" command. Usually segment IDs (called chain IDs by some software) are one character, but this is not a requirement in CHARMM. In this example, we will use "A" as the segment ID. The name of the segment is given directly after the GENErate command. The "SETUp" command creates the IC table. The generate command should look like this:

GENErate A SETUp

Once the sequence is read in and the internal coordinate table is generated, it is necessary to move the file pointer back to the beginning of the file so that the coordinates can be read "from the top." This can be done by closing and reopening the file or alternatively with the rewind command:

REWInd UNIT 1

With the file rewound the coordinates can be read by telling CHARMM to read PDB formatted Cartesian coordinates:

READ COOR PDB UNIT 1

After a script is done reading a file, it should be closed. Closing a PDB file follows the same procedure as the topology and parameter files.

The script should now resemble:

 *CHARMM Tutorial
 *
 
 !Read in RTF and Parameter files
OPEN UNIT 1 CARD READ NAME /charmm/c32b2/toppar/top_all27_prot_na.rtf
READ RTF CARD UNIT 1
CLOSE UNIT 1

OPEN UNIT 1 CARD READ NAME /charmm/c32b2/toppar/par_all27_prot_na.prm
READ PARA CARD UNIT 1
CLOSE UNIT 1


!Read in the pdb
OPEN UNIT 1 CARD READ NAME /path/to/pdbfile.pdb
READ SEQU PDB UNIT 1

GENErate A SETUp

REWInd UNIT 1

READ COOR PDB UNIT 1
CLOSE UNIT 1

Now that CHARMM knows about the Cartesian coordinates, the internal coordinate table, which was generated above, can be filled in with useful information. The "IC FILL" command fills the internal coordinates based on the information from the Cartesian coordinates. These, however, may not be sufficient to construct all of the internal coordinates so the "IC PARAM" command should be used to fill in information based on the parameters. Now unknown Cartesian coordinates for non-hydrogens can be built from the internal coordinates by a "IC BUILD" command. The commands look like the following:

!Involves known atomic coordinates and fills in missing coordinates
IC FILL

!Will retrieve standard coordinates from the parameter file
IC PARAmeters
!Will find the Cartesian coordinates previously unknown
IC BUILd

The above sequence of commands is a generic way to add as many missing coordinates as possible based on internal coordinate information obtained from the RTF and the parameter file. Since protein structures submitted to the PDB usually do not contain coordinates for hydrogens, CHARMM provides a tool to add coordinates for hydrogens, HBUIld. The quality of coordinates generated with HBUIld is in general much better than that of the coordinates obtained from the IC commands. Thus, one should follow the above sequence of commands by

!Rebuild hydrogen coordinates
HBUIld SELE ALL END

HBUILd may have problems when given an atom selection. In these cases it is best to try running HBUIld by itself with no arguments.

Any missing coordinates are set to 9999.0. These can be displayed with:

PRINt COORdinate SELEct .NOT. INITialized END

After the "IC" and "HBUIld" commands have been called, the newly-generated PSF and coordinates can be written into files. Coordinates can be written out as PDB files or as CHARMM formatted coordinate (CRD) files. First, the files must be opened using the "OPEN" command. Next, a "WRITe" command will tell CHARMM the intent of the user. Since a file has been opened, it must have an assigned pointer by a unit number, which works the same way as for a read command. The card format is also specified during the open, just as for a read and work in exactly the same way. After the write command is initiated, include "CARD". Following "CARD", and just like reading in a file, a "NAME" command is used followed by the path to where the user wants the PSF file to be saved.

After the open command, use the "WRITe" command followed by "PSF" then "CARD" and finally, point to "UNIT 1". The open and write lines for the Protein Structure File should now look like this:

OPEN UNIT 1 WRITe CARD NAME /path/to/new/file.psf
WRITe PSF CARD UNIT 1

Immediately after the write command a title for the newly written file should be given in the usual format. All together, the write statement for a PSF file should resemble the following:

OPEN UNIT 1 WRITe CARD NAME /path/to/new/file.psf WRITe PSF CARD UNIT 1

Immediately after the write command a title for the newly written file should be given in the usual format. All together, the write statement for a PSF file should resemble the following:

OPEN UNIT 1 WRITe CARD NAME /PATH/TO/NEW/FILE.PSF
WRITe PSF CARD UNIT 1
*PSF File
*

CHARMM will automatically close the file after the title and data are written. Therefore, it is not necessary (and in fact will cause a warning) to add a close command after this.

Terminate the script with a simple "STOP".

The input file should now look like the following:

 *CHARMM Tutorial
 *
 
 !Read in RTF and Parameter files
 OPEN UNIT 1 CARD READ NAME /charmm/c32b2/toppar/top_all27_prot_na.rtf
 READ RTF CARD UNIT 1
 CLOSE UNIT 1
 
 OPEN UNIT 1 CARD READ NAME /charmm/c32b2/toppar/par_all27_prot_na.prm
 READ PARA CARD UNIT 1
 CLOSE UNIT 1
 
 !Read in the pdb
 OPEN UNIT 1 CARD READ NAME /path/to/pdbfile.pdb
 READ SEQU PDB UNIT 1
 
 !Time for CHARMM to make the Internal Coordinates table
 GENErate A SETUp
 
 REWInd UNIT 1
 READ COOR PDB UNIT 1
 CLOSE UNIT 1
 
 
 !Involves known atomic coordinates and fills in missing coordinates

 IC FILL
 
 !Will retrieve standard coordinates from the parameter file

 IC PARAmeters
 
 !Will find the Cartesian coordinates previously unknown
 IC BUILd
 
 !Rebuild all hydrogen coordinates
 HBUILD SELEct HYDRogen END
 
 !Check for remaining missing atoms
 PRINt COOR SELEct .NOT. INITialized END
 
 OPEN UNIT 1 WRITe CARD NAME /path/to/new/protein_file.psf
 WRITe PSF CARD UNIT 1
 * PSF File
 *

 OPEN UNIT 1 WRITe CARD NAME /path/to/new/protein_file.pdb
 WRITe COOR PDB UNIT 1
 *PDB with Coordinates
 *
 
 OPEN UNIT 1 WRITe CARD NAME /path/to/new/protein_file.crd
 WRITe COOR CARD UNIT 1
 *CRD File
 *

 STOP

Adding crystal waters to the system: The 1YJP PDB file (as many other structures in the PDB) contains not only coordinates for the atoms that are actually part of the protein, but also coordinates for some embedded waters (or rather the positions of the water oxygens). Biological systems (e.g. proteins) generally occur in solution and water molecules often become embedded in pockets and are subsequently picked up during NMR and X-Ray crystallography. A file called 1yjp_waters.pdb containing the positions of the water molecules has been created and is part of the set of files found in the tar.gz archive at http://www.charmmtutorial.org. A new input script will be made to read these in. The title, topology, and parameter files should all be read in as done above; however, this time read in the PDB file containing the waters rather than the protein PDB.

Because the water PDB file is now being dealt with, there are no bond angles to deal with explicitly (they are implicit within the water model), and no dihedral angles at all (as H2O has only three atoms). Therefore, replace the protein input scripts "GENErate" command with:

GENErate A NOANgle NODIhedral SETUp

An additional modification to the read command must be made. It should be changed to READ COOR PDB UNIT 1 since CHARMM will want to read in the coordinates straight from the PDB file. If this is not done, CHARMM will try to generate its own coordinates as though the water was bonded to one another, which would then result in CHARMM creating undefined coordinates. A PRINt COOR statement should also be added for the users benefit to make sure everything went well.

Also, the "IC" commands can be removed, but leave "HBUIld" as the water model only contains the position of the Oxygen atom for each water, requiring that the hydrogen positions still be explicitly built. The write commands should be similar to the ones above but rename the output files so they are not overwritten.

The CHARMM water input file should resemble the following:

 * Run Segment Through CHARMM
 *
 bomlev -2
 
 ! Read in Topology and  Parameter files
OPEN UNIT 1 CARD READ name top_all27_prot_na.rtf
READ RTF CARD UNIT 1
CLOSe UNIT 1

OPEN UNIT 1 CARD READ NAME par_all27_prot_na.prm
READ PARA CARD UNIT 1
CLOSe UNIT 1

!Read sequence from the PDB coordinate file
OPEN UNIT 1 CARD READ NAME 1yjp-a-goodhet.pdb
READ SEQU PDB UNIT 1
CLOSe UNIT 1

!now generate the PSF and also the IC table (SETU keyword)
GENErate A NOANg NODIhedral SETUp

!Read protein coord from the PDB coordinate file
OPEN UNIT 1 CARD READ NAME 1yjp-a-goodhet.pdb
READ COOR PDB UNIT 1
CLOSe UNIT 1

HBUILd SELEct ALL END

!write out the protein structure file (psf) and
!the coordinate file in pdb and crd format.
 
 OPEN UNIT 1 CARD WRITe NAME 1yjp_waters.pdb
 WRITe COOR PDB UNIT 1
 * PDB Coords
 *
 
 OPEN UNIT 1 CARD WRITe NAME filename.crd
 WRITe COOR CARD UNIT 1
 * Coords
 *
 
 OPEN UNIT 1 CARD WRITe NAME filename.psf
 WRITe PSF CARD UNIT 1
 * PSF
 *
 
 STOP

Merging the two PSFs: Once PSFs have been generated for both the (crystal) waters and the protein, it is necessary to construct an input file merging the two PSFs to make one unified system. The proteins PSF and CRD files should be read in and then the water PSF and CRD should be read in. When reading in the water PSF, the "APPEnd" keyword should be used. This will append the water PSF to the protein PSF. For the coordinate read command use:

READ COOR IGNOre UNIT 1 SELEct RESN TIP3 END

The IGNOre keyword turns off CHARMMs checking feature which would normally throw warnings since the PSF and CRD files for the waters and the protein do not match. Next, something must be selected using the SELEct command. Since all of the residue names are TIP3 in the water file, the RESName to select can be used. Finish off the script with commands to write out the new unified PSF, PDB, and CRD files.

The append script should resemble the following:

 * Append the PDBs
 *
 bomlev -2
 
 ! Read in Topology and  Parameter files
OPEN UNIT 1 CARD READ NAME top_all27_prot_na.rtf
READ RTF CARD UNIT 1
CLOSe UNIT 1

OPEN UNIT 1 CARD READ NAME par_all27_prot_na.prm
READ PARA CARD UNIT 1
CLOSe UNIT 1

! Read PSF from file
OPEN UNIT 1 FORM READ NAME 1yjp.psf
READ PSF CARD UNIT 1
CLOSe UNIT 1

OPEN UNIT 1 FORM READ NAME 1yjp_waters.psf
READ PSF CARD APPEnd UNIT 1
CLOSe UNIT 1

! Read protein coord from the coordinate file
OPEN UNIT 1 CARD READ NAME 1yjp.crd
READ COOR CARD UNIT 1 RESID
CLOSE UNIT 1

OPEN UNIT 1 CARD READ NAME 1yjp_waters.crd
READ COOR RESId IGNOre UNIT 1 SELEct SEGId A .AND. RESN TIP3 END
CLOSE UNIT 1

! redo hydrogen coordinates for the complete structure
 COOR INIT SELE HYDRogen END
 HBUILd
 
 ENERgy
 
 OPEN UNIT 1 CARD WRITe NAME 1yjp-full.psf
 WRITe PSF CARD UNIT 1
 * PSF
 *
 
 OPEN UNIT 1 CARD WRITe NAME 1yjp-full.pdb
 WRITe COOR PDB UNIT 1
 * Coords
 *
 
 OPEN UNIT 1 CARD WRITe NAME 1yjp-full.crd
 WRITe COOR CARD UNIT 1
 * Coords
 *
 
 stop
 



Personal tools
Navigation