Re: CCL:still need help: pdb --> charmm
On Wed, 30 Jan 2002, J. Zheng wrote:
> My research is molecular modeling of protein adsorption. What I want is
> to get initial configuration from PDB, then convert to Charmm Format.
> It is only way that I can use CHARMM force field. So, from charmm program,
> I want to get
>
> 1) COOR file which includes atomtype of charmm.
Depends on what you mean by "atomtype"-- the COOR file really contains
just atom names; the atom parameter type info is stored in the PSF as an
integer type code, which corresponds to to those declared with MASS
statements at the beginning of the RTF.
> 2) PSF file which includes connectivity table.
It also enumerates the energy terms, which includes bonding,
i.e. connectivity, as well as all angles and torsions which will be
explicitly evaluated.
> Based on these two files, I will code by myself for a speicfic protein
> adsorption. In order to do this project, our group purchase CHARMM. But
> nobody know how to use it. That is why I ask you guys for helping.
>
> After 2 days, I have got prelimitary results. But I don't know whether
> it is right. If correct, I will summary what I do and what you guys
> suggested later. otherwise, I have to still struggle on this conversion.
>
> -------------------------------------------------------------------------
>
> 1. Charmm program
>
> suggested by Torsten Becker <torsten.becker |-at-|
iwr.uni-heidelberg.de>
> Xiang(Simon) WANG <simwang |-at-| chem.ufl.edu>
> Marcin Krol <mykrol |-at-| cyf-kr.edu.pl>
> Dr. Richard L. Wood <rlw28 |-at-| cornell.edu>
> Szilveszter Juhos <szilva |-at-| ribotargets.com>
>
> following I will list COMMAND I inputed. (based on Becker's script)
>
> ok 1) * Emacs generated Charmm Inputfile
> *
It's generally better to use a more descriptive title.
> ok 2) bomlev 0 (i don't know what means of this command though)
A negative bomlev will tolerate errors-- bomlev -1 will forgive typing
errors, so that CHARMM won't stop if you misspell a command, keyword or
filename. Very useful for interactive sessions. See miscom.doc
> ok 3) open read unit 1 card name
> /charmm/c27b4/toppar/top_all22_prot.inp
> read rtf card unit 1
> close unit 1
>
> ok 4) open read unit 1 card name
> /charmm/c27b4/toppar/par_all22_prot.inp
> read param card unit 1
> close unit 1
>
> -----------------------------------------------------------
> when I read param card unit 1, screen prints lots of
> information as following format
> PARRDR> Multiple terms for dihedral type: INDEX 99 CODE 1369603 CT2
> -SM -SM -CT2
> PARRDR> Multiple terms for dihedral type: INDEX 98 CODE 1369603 CT2 -SM
> -SM -CT2
> PARRDR> Multiple terms for dihedral type: INDEX 96 CODE 1367651 CT2 -C
> -NH1 -CT2
> ------------------------------------------------------------------
>
> I was supposed that it is right.
They are informational messages-- don't worry about for now, unless
you've actually modified the parameter file.
> ok?? 5) open read unit 1 card name "7lyz.pdb"
> read sequ pdb unit 1
> close unit 1
>
> ------charmm gives this information------
>
> ***** Message from SEQRDR ***** THE SYSTEM CONTAINS 18 TITRATABLE GROUPS
> THE USER MUST PREDETERMINE THE PROTONATION STATE THROUGH THE SEQUENCE AND
> RTF
> HIS - 0 ASP - 7 GLU - 2 LYS - 6 TYR - 3
>
> what does it means?
Default protonation states haev been used for the above residues, which
is typically okay for ASP, GLU, LYS, and TYR unless they are buried deep
inside the protein. The HIS residue is more complicated, because it has
neutral tautomers and a protonated state, and allowing the default could
be a serious mistake. Each form has a different residue name.
> --------------------------------------------
>
> ok?? 6) generate 7lyz
Possibly not okay; you should probably included the SETUP option to
create an internal coordinate (IC) table; more about this later.
>
> -------------------------------------
> CHARMM> generate 7lyz
> THE PATCH 'NTER ' WILL BE USED FOR THE FIRST RESIDUE
> THE PATCH 'CTER ' WILL BE USED FOR THE LAST RESIDUE
> GENPSF> Segment 1 has been generated. Its identifier is 7LYZ.
> PSFSUM> PSF modified: NONBOND lists and IMAGE atoms cleared.
> PSFSUM> Summary of the structure file counters :
> Number of segments = 1 Number of residues = 129
> Number of atoms = 1968 Number of groups = 586
> Number of bonds = 1988 Number of angles = 3547
> Number of dihedrals = 5183 Number of impropers = 351
> Number of HB acceptors = 186 Number of HB donors = 271
> Number of NB exclusions = 0 Total charge = 8.00000
> -----------------------------------------------
>
> not ok 7) open read unit 1 card name "7lyz.pdb"
> read coor pdb unit 1
> close unit 1
>
> --------------------------------------------------
> ** WARNING ** For atom in coordinate file, the corresponding residue in
> the PSF lacks that atom:
> INDEX= 996 IRES= 129 RESID=129 RES=LEU ATOM=O
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 2 1 LYS HT1
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 3 1 LYS HT2
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 4 1 LYS HT3
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 6 1 LYS HA
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 8 1 LYS HB1
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 9 1 LYS HB2
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 11 1 LYS HG1
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 12 1 LYS HG2
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 14 1 LYS HD1
> ** WARNING ** After reading, there are no coordinates for selected atom:
> 15 1 LYS HD2
> ** A total of 970 selected atoms have no coordinates
> *** LEVEL 2 WARNING *** BOMLEV IS 0
> ---------------------------------------------------------------------
Many PDB files do not have hydrogens, and even if they do, the names
probably aren't the same as those that CHARMM uses. THe typical
solution is build the H atom positions via either the IC BUILD
(intcor.doc) or HBUILD (hbuild.doc) commands (see below).
>
> ok?? 8) open write unit 2 card name "a.crd"
> write coor unit 2 card
> * 7lyz coor
> *
> close unit 2
Sort of-- none of the H atoms will have coords. Also, the file is
automatically closed when writing, hence the next message.
> -------------------------------------------------------
> CLOLGU> *****
> WARNING ***** Attempt to close unit that was not open.
> ---------------------------------------------------------------
>
> ok?? 9) open write unit 3 card name "a.psf"
> write psf unit 3 card
> * 7lyz psf
> *
> close unit 3
Likewise for CLOSE command-- not needed after a WRITE; the file should
be okay, unless the protein has disulfide bonds.
> =================================================================
>
> RESULTS:
>
> I have check a.crd and a.psf file. I found that there are 1968 atoms.
> Compared with original 1000 atom of 7lyz.pdf, there are more 968 atom in
> a.crd and a.psf files. and some atoms don't have coordinate since the
> coordinates were labelled 9999.0000..
>
> That is what I done. I felt these initial configurature may not be
> right. I have to resort to you guys.
The CHARMM forcefield represents all atoms, including non-polar H atoms,
explicitly; many crystal structures of proteins have only heavy atom
coordinates. Most of atoms w/o coordinates are probably H atoms.
Add the SETUP keyword to the GENERATE command (struct.doc), and after
reading the heavy atom coords from the PDB file you can (N.B. the ! char
starts a comment in CHARMM syntax):
ic purge ! CLEANUP IC TABLE
ic param ! GET MISSING BONDS AND ANGLES FROM PARAMETER FILE
ic build ! PLACE ANY MISSING COORDS, E.G. TERMINAL O ON CO2-
! CHECK FOR MISSING HEAVY ATOM COORDS
define test sele ( .not. type H* ) .and. ( .not. init ) show end
! USE HBUILD TO REBUILD H ATOMS; SPINS METHYLS, ETC. TO LOCAL MINIMUM
coor init sele type H* end
hbuild sele type H* end
! CHECK FOR ANY MISSING COORDS
define test sele .not. init show end
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=
Rick Venable
FDA/CBER/OVRR Biophysics Lab
1401 Rockville Pike HFM-419
Rockville, MD 20852-1448 U.S.A.
(301) 496-1905 Rick_Venable |-at-| nih.gov
ALT email: rvenable |-at-| speakeasy.org
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=