CCL: PDB file format : List of atom name / element type correspond
- From: "Konrad Hinsen"
<hinsen{=}cnrs-orleans.fr>
- Subject: CCL: PDB file format : List of atom name / element type
correspond
- Date: Thu, 5 Mar 2009 09:28:42 -0500
Sent to CCL by: "Konrad Hinsen" [hinsen^^cnrs-orleans.fr]
On 05.03.2009, at 12:55, Peter Schmidtke pschmidtke^mmb.pcb.ub.es wrote:
> I am looking for a list of atom names that occur in PDB files and their
> corresponding elements.
For files that fully respect the PDB conventions, you can find all the
information you need in the PDB format definition:
http://www.wwpdb.org/docs.html
As you noted, recent versions of the PDB format conventions state that
the element should be given explicitly in columns 77-78. For files
conforming to older versions of the PDB format, you can obtain the
element from the first two characters of the atom name. If the first
character is a digit, then you should use only the second character
(which must then be H).
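The rule above can be sketched in a few lines of Python. This is only an illustration for well-formed files, not a robust parser: the function name element_from_atom_record is made up for this example, and it assumes fixed-column ATOM/HETATM records in which the atom name occupies columns 13-16 and a one-letter element leaves the first of those columns blank.

```python
def element_from_atom_record(line):
    """Guess the element symbol for one ATOM/HETATM line (hypothetical helper)."""
    # Pad short lines so the column slices below never run off the end.
    padded = line.rstrip("\n").ljust(80)

    # Newer files: explicit element in columns 77-78 (0-based slice 76:78).
    explicit = padded[76:78].strip()
    if explicit.isalpha():
        return explicit.capitalize()

    # Older files: fall back to the atom name.
    # Assumption: the atom name sits in columns 13-16 (0-based slice 12:16).
    name = padded[12:16]
    if name[0].isdigit():
        # A leading digit means the element is the second character (H).
        return name[1].upper()
    # Otherwise use the first two characters; a leading blank leaves one letter.
    return name[0:2].strip().capitalize()


if __name__ == "__main__":
    prefix = "ATOM      1 "  # columns 1-12: record name, serial number, blank
    for atom_name in (" CA ", "CA  ", "1HB ", "FE  "):
        record = prefix + atom_name + " ALA A   1"
        print(repr(atom_name), element_from_atom_record(record))
```

Note that " CA " (alpha carbon) and "CA  " (calcium) differ only in column alignment, so this sketch gives wrong answers for any file that does not align atom names as the format requires.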
However, the real problem is that the vast majority of programs write
PDB files that do not fully respect the PDB file format conventions.
The summary of my ten years of experience in dealing with PDB files of
various origins is that you can't rely on anything concerning atom and
residue names. There is no way to avoid special treatment for files
produced by specific programs.
My advice is:
1) If you only need to handle PDB files from a specific source, write
code specifically for that variety of the PDB format and get on with
your science.
2) If you want to write a general tool that can handle a large variety
of PDB files, use existing PDB parsers and support libraries that do at
least part of the nasty work.
In the category of existing libraries, I'll mention my own, which are
written in Python:
- Module Scientific.IO.PDB in Scientific Python:
http://dirac.cnrs-orleans.fr/ScientificPython/
- Modules MMTK.PDB and MMTK.PDBMoleculeFactory in the Molecular
Modelling Toolkit:
http://dirac.cnrs-orleans.fr/MMTK/
--
---------------------------------------------------------------------
Konrad Hinsen
Centre de Biophysique Moléculaire, CNRS Orléans
Synchrotron Soleil - Division Expériences
Saint Aubin - BP 48
91192 Gif sur Yvette Cedex, France
Tel. +33-1 69 35 97 15
E-Mail: hinsen|cnrs-orleans.fr
---------------------------------------------------------------------