Cartesian to PDB Conversion - A Summary



 Dear Netters,
 Below is a list of the solutions I have received in my quest for a
 routine to convert Cartesian coordinates into SYBYL-readable
 PDB format. This list does not include Jan's posting of the
 Fortran executables he has made available via anonymous ftp. This
 list also does not include the many generous offers of help in
 trying to solve this problem. Again thanks to one and all who
 responded to my query!
 -mark z.
 1)
         Mark,
 	The way I have been doing this type of transformation is as
         follows:
 	-with the XYZ file, do a run with Mopac using the 0SCF
          keyword. If your version of Mopac (I assume you have
          one) is 6.0, the output (*.out file) should give you the
          system's coordinates in both XYZ and INTERNAL forms.
         -from the output, copy the block that contains the system
          in internal coordinates (last block) into a file that must
          contain the upper part (everything, but the coordinates)
          of an *.arc file from Mopac. This new file must be named
          *.sta to be read by Sybyl. Also, in the just created *.sta
          file, the 3rd non-blank line (line containing the total
          number of atoms of every type) must be modified accordingly.
          Obviously, the starting *.arc file (if you don't have one)
          can be easily created by doing a Mopac run (in the order of
          seconds) of a simple system.
     NOTES:
 	-In my version of Sybyl I also have to change the second
 	 non-blank line (version line) of the just created *.sta file to
          read "VERSION 5.00", and I have to edit the coordinate columns
          to look exactly like the example below. These last operations
          can be readily carried out with a simple script (perhaps
          in "nawk").
 -------------------------------CUT HERE---------------------------------
                      SUMMARY OF   AM1   CALCULATION
                                                             VERSION  5.00
  AM1 BONDS EF
  /tmp_mnt/home/eby/rafael/sybyl/pbztPolymopac2.dat
  DEFINING A DUMMY ATOM (WHICH COINCIDES WITH THE Tv)
   C    0.0000000  0    0.000000  0    0.000000  0    0    0    0
   C    1.4023182  1    0.000000  0    0.000000  0    1    0    0
   C    1.3925309  1  120.491775  1    0.000000  0    2    1    0
   C    1.4018573  1  120.013080  1    0.011448  1    3    2    1
   C    1.4022088  1  119.501337  1   -0.006766  1    4    3    2
   C    1.4017758  1  119.481924  1   -0.008443  1    1    2    3
   H    1.1032095  1  120.192216  1 -179.994083  1    2    1    3
   H    1.1019895  1  119.576861  1 -179.990156  1    3    2    1
   H    1.1031646  1  120.204522  1 -179.998951  1    5    4    3
   H    1.1019215  1  120.416823  1 -179.997841  1    6    1    2
   C    1.4599093  1  121.035896  1  179.993477  1    4    3    2
   N    1.3231421  1  125.404264  1    0.033392  1   11    4    3
   S    1.7494256  1  119.181316  1 -179.962314  1   11    4    3
   C    1.4068053  1  109.848965  1 -179.998738  1   12   11    4
   C    1.4014290  1  125.023003  1 -179.995792  1   14   12   11
   C    1.3872028  1  118.164477  1 -179.999537  1   15   14   12
   C    1.4432826  1  121.214175  1    0.004160  1   16   15   14
   C    1.4014243  1  120.626432  1   -0.003837  1   17   16   15
   C    1.4432322  1  114.366623  1    0.004121  1   14   12   11
   H    1.1009352  1  120.485763  1    0.002677  1   15   14   12
   H    1.1008893  1  120.515321  1 -179.999542  1   18   17   16
   S    1.6921053  1  129.287470  1 -179.997306  1   16   15   14
   N    1.4067819  1  114.372176  1  179.996517  1   17   16   15
   C    1.3231197  1  109.830044  1    0.002209  1   23   17   16
  XX    1.5088405  1  125.461003  1  179.998250  1   24   23   17
  Tv   12.5597124  1    0.000000  0    0.000000  0    1   25   22
   0    0.0000000  0    0.000000  0    0.000000  0    0    0    0
 -------------------------------CUT HERE------------------------------------
 	Hope this helps !!
 					Con Saludos,
   Rafael G. Ramirez
   ------------------------------------------------------------------
   email : rafael@eby.polymer.uakron.edu       phone:  (216) 972-5810
   usmail: Institute of Polymer Science        FAX  :  (216) 972-5290
           The University of Akron,
           Akron, OH 44325-3909
           U.  S.  A.
 2)
 #! /bin/sh
 awk '{printf "%s %6s %s %-3s %2s %5s %11.3f %7.3f %7.3f %s %s\n",
 "ATOM", NR, "", $1, "RES", "1", $2, $3,
 $4, " 1.00", " 0.00"
 }' "$1"
 The above shell script should convert simple cartesians into pdb
 format. The first $1 refers to column 1 of the input file which should
 contain the atom name. $2,$3,$4 refer to the cartesian coordinates, in
 this case in columns 2,3 and 4 of the input file. These can of course be
 changed if the atom name and coordinates are in different columns. The
 second $1 refers to the input file. Just put the above into a file, call
 it "con", make it executable and type con input_file > output_file.
 This should give a format readable as "pdb" format.
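 As a quick check of the script above, the sketch below wraps a similar
 awk call in a shell function and feeds it one sample line. It is only an
 illustration under stated assumptions: the function name xyz_to_pdb, the
 residue name RES, and the strict fixed-column layout (serial number in
 columns 7-11, atom name starting in column 13, x/y/z in columns 31-54)
 are choices made here, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): convert lines of
# "name x y z" into fixed-column PDB ATOM records.
# Residue name RES and residue number 1 are placeholders.
xyz_to_pdb() {
    awk 'NF >= 4 {
        n++   # running atom serial number
        printf "ATOM  %5d %-4s %3s  %4d    %8.3f%8.3f%8.3f%6.2f%6.2f\n",
               n, $1, "RES", 1, $2, $3, $4, 1.00, 0.00
    }' "$@"
}

# Example: one atom on standard input
printf 'C1 9.6949 2.4467 -0.3357\n' | xyz_to_pdb
```

 Called with a file name (xyz_to_pdb input_file > output_file) it reads
 that file instead of standard input. Lines with fewer than four fields
 are skipped, so blank lines in the input do no harm.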
 Cheers
 Nick Tomkinson
 chs1nt@surrey.ac.uk
 3)
 C
 C GAUPDB.FOR     This program transforms gaussian cartesian format
 C                into PDB files.  Using PDB format, you can read
 C                your coordinates into SYBYL.  Program was written
 C                and compiled on VAX under VMS.  Input files should
 C                be trimmed out of your Gaussian output and given
 C                the extension .XYZ.  Output files will have the extension
 C                .PDB.  See comments below!! Questions about the code:
 C
 C                Dr. Rick Gussio
 C                NCI-Biomedical Supercomputing Center
 C                P.O. Box B, Bldg. 430
 C                Frederick, MD 21202
 C
 C                Email: gussio@ncifcrf.gov
 C
         CHARACTER FILENAME*35,OUTFILE*35,TITLE*80,WLINE*80,ATOM*4
 	CHARACTER TYPE*4,RES*3,TRSH1*1,TRSH2*2,TRSH3*3,TRSH4*4,SEGID*4
         CHARACTER FI*35
 	REAL X,Y,Z,W,PAF
 	INTEGER ATOMNO,TYPENO,RESNO,TOTATO,I
 C
 C line headings
 C
 	ATOM='ATOM'
 	PAF=0
 C
 C  formats for total line read
 C
 10 	FORMAT(A4)
 15	FORMAT(I5)
 20	FORMAT(A80)
 C
 C  a few formats
 C
 30      FORMAT(1X,I4,7X,I4,7X,3F12.6)
 40	FORMAT(A4,I7,2X,A3,1X,A3,3X,I3,4X,3F8.4)
 C
 C  prompt user for filename
 C
 	WRITE(6,*)'    '
 	WRITE(6,*)' Please ENTER the Filename WITHOUT the Extension : '
 	READ(5,43) FILENAME
         WRITE(6,43) FILENAME
 43	FORMAT(A35)
 C
 C  remove trailing spaces
 C
 	I=LAST(FILENAME,35)
 C
 C Find cartesian coordinates in the gaussian output file: eg.
 C
 C    1          8          -3.080796   -0.357418   -0.065404
 C    2          1          -4.063259   -0.115795   -0.258498
 C    3          1          -2.940725   -0.680416    0.902559
 C    4          7          -0.308698    0.805764   -0.011805
 C    5          8           0.966751    1.594701    0.016611
 C    6          7           1.058242   -1.403593   -0.026561
 C    7          8           2.333691   -0.614656    0.001856
 C
 C      Create a file, the file name should have
 C      the extension .XYZ
 C
 	OPEN(UNIT=1,FILE=FILENAME(1:I)//'.XYZ',STATUS='OLD')
 C
 C  output file will have .PDB extension
 C
 	OUTFILE=FILENAME(1:I)//'.PDB'
  	 OPEN(UNIT=2,FILE=OUTFILE,STATUS='NEW',FORM='FORMATTED',
      +         ACCESS='SEQUENTIAL',CARRIAGECONTROL='LIST')
 C
 C
 C  read title lines
 C  file filter
 C
 80      CONTINUE
 C
 	READ(1,20,END=1000) WLINE
 	READ(WLINE,15) ATOMNO
 	READ(WLINE,30) ATOMNO,TYPENO,X,Y,Z
         WRITE(6,30)    ATOMNO,TYPENO,X,Y,Z
         RESNO = 1
         TRSH1 = 'G'
         RES = 'GAU'
         TRSH2= 'Z'
 C default label for atomic numbers not listed below
         TYPE= 'X   '
         IF( TYPENO .EQ. 1 ) TYPE= 'H   '
         IF( TYPENO .EQ. 6 ) TYPE= 'C   '
         IF( TYPENO .EQ. 7 ) TYPE= 'N   '
         IF( TYPENO .EQ. 8 ) TYPE= 'O   '
         IF( TYPENO .EQ. 15) TYPE= 'P   '
         IF( TYPENO .EQ. 16) TYPE= 'S   '
         IF( TYPENO .EQ. 17) TYPE= 'CL  '
         IF( TYPENO .LE. 0 ) TYPE= 'X   '
         TRSH3= 'G'
         SEGID='GAUS'
         TRSH4='G'
         W=0.000
 C
 C write pdb file
 C
 	WRITE(2,40) ATOM, ATOMNO,TYPE,RES,RESNO,X,Y,Z
         GO TO 80
 C
 C close files
 C
 99      FORMAT(A3)
 1000	WRITE(2,99) 'TER'
         CLOSE(UNIT=2)
         CLOSE(UNIT=1)
 	STOP
 	END
 C
 C Appends extensions to filenames:
 C this function finds the last non blank character
 C
 	FUNCTION LAST(TEXT,N)
 	CHARACTER TEXT*(*)
 	DO 1 I=N,1,-1
 1	IF(TEXT(I:I) .NE.' ') GO TO 2
 	I=1
 2	LAST=I
 	RETURN
         END
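 For readers without a VAX Fortran compiler, the same idea (map the
 atomic number in the second column to an element symbol, then print
 fixed-column ATOM records followed by TER) can be sketched in a few
 lines of awk. This is an illustrative sketch, not a port of GAUPDB.FOR:
 the function name gau_to_pdb is hypothetical, coordinates are written
 with three decimals rather than four, and any atomic number not in the
 table is written as X.

```shell
#!/bin/sh
# Hypothetical helper: read Gaussian-style lines
#   "center  atomic_number  x  y  z"
# and write PDB-style ATOM records, ending with TER.
gau_to_pdb() {
    awk 'BEGIN {
        # same elements as the Fortran listing above
        sym[1] = "H";  sym[6] = "C";  sym[7] = "N";  sym[8] = "O"
        sym[15] = "P"; sym[16] = "S"; sym[17] = "CL"
    }
    NF == 5 && $1 ~ /^[0-9]+$/ {
        el = ($2 in sym) ? sym[$2] : "X"   # X for unlisted elements
        printf "ATOM  %5d  %-3s %3s  %4d    %8.3f%8.3f%8.3f\n",
               $1, el, "GAU", 1, $3, $4, $5
    }
    END { print "TER" }' "$@"
}

# Example: two centres taken from the comment block in GAUPDB.FOR
printf '1 8 -3.080796 -0.357418 -0.065404\n2 1 -4.063259 -0.115795 -0.258498\n' |
    gau_to_pdb
```

 Because awk splits on whitespace, the input does not need to match the
 exact column positions that the Fortran FORMAT statement expects.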
 4)
    SYBYL has an interface that was originally written for the GAUSSIAN 86
 program, but I believe it will also write and read files for newer versions
 of GAUSSIAN.  From the command line, use the SYBYL command:
 SYBYL> GAUSS86 <molecule area> RETRIEVE <fileset name> GEOMETRY
 This command assumes a copy of the molecule exists in the molecule area
 you entered and will update the x,y,z coordinates of the molecule using
 the coordinates in the GAUSSIAN output file.  Thus if you can somehow
 get a copy of your molecule (with any geometry, it doesn't matter how
 poor, it's only important that the atom numbering be the same as that
 used in the GAUSSIAN calculation) into SYBYL, this may be a way for
 you to read the GAUSSIAN structure into SYBYL.
 Note that the GAUSS86 RETRIEVE command will make no changes to
 connectivities or atom types.  It simply modifies the x,y,z coordinates
 of the atoms.
 One quick and dirty way to make a starting structure in SYBYL of your
 molecule that you could use with the GAUSS86 RETRIEVE command would be
 to use the SYBYL command:
 SYBYL> ADD RAWATOM M1 <atom name> <atom type> 0 0 0
 for each atom in the molecule.  This will place all the atoms on top
 of each other at 0,0,0.  This is OK though, because when you then use
 the GAUSS86 M1 RETRIEVE command, it will place the atoms at their correct
 x,y,z positions.  To quickly generate the bonds, use the SYBYL command
 SYBYL> CRYSIN M1 CONNECT * * NO_SYMMETRY_SEARCH BOND_LENGTH_TABLE
 after you have used the GAUSS86 command.  This will automatically create
 bonds based on distances between atoms.
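 If the molecule has many atoms, typing one ADD RAWATOM line per atom by
 hand gets tedious. The sketch below prints those command lines from a
 two-column list of atom names and atom types, so they can be pasted into
 SYBYL or saved in a command file. It only fills in the template quoted
 above; the function name rawatom_cmds, the input layout, and the sample
 atom types (C.3, H) are assumptions for illustration, and the exact
 argument syntax should be checked against your SYBYL version.

```shell
#!/bin/sh
# Hypothetical helper: emit one "ADD RAWATOM" line per atom.
# Input: lines of "<atom name> <atom type>". Every atom is placed at
# 0 0 0, to be moved later by the GAUSS86 ... RETRIEVE command.
rawatom_cmds() {
    area=${1:-M1}   # molecule area, M1 by default
    awk -v area="$area" 'NF >= 2 {
        printf "ADD RAWATOM %s %s %s 0 0 0\n", area, $1, $2
    }'
}

# Example: two atoms with illustrative names and types
printf 'C1 C.3\nH1 H\n' | rawatom_cmds M1
```

 The atoms must be listed in the same order as in the GAUSSIAN
 calculation, since the retrieve step relies on matching atom numbering.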
 I hope this helps you with your problem.  If not, let me know and we
 can probably put together a little SPL script that will help you.
 Regards,
 Vic Lewchenko
 Tripos Associates, Inc.
 St.Louis, MO
 victor@tripos.com
 5)
 	I had the same need for a conversion program to Sybyl before we
 got G92. If the newzmat utility doesn't work out for you (I haven't tried
 it), I'm sending you a simple fortran program written for the vax that worked
 for me. It converts cart coord to a pseudo-pdb format that SYBYL will read.
 Two things you may have to change around: the fortran format definitions to
 suit your needs, and atom types once the molecule is in SYBYL (no big deal).
 Along with the short program, I'm sending an example .com file you can use
 to mimic the format as well as the output you can try in SYBYL. Let me know
 if you have any questions or if you don't receive all the files.
 Happy Holidays
 the conversion program pdbfor.for:
 C Program to Convert Cartesian Coords to Sybyl readable PDB format
       DIMENSION X1(5000), Y(5000), Z(5000)
       INTEGER I,J,N,X
       REAL X1,Y,Z
       CHARACTER*80 NAME(5000), FNAME, JUNK(5000)
       I = 1
       X = 0
       DO 40 N = 1, 5000
          READ(5,25,END=45,ERR=45) JUNK(N)
  25      FORMAT(A80)
          IF (JUNK(N)(20:30) .EQ. '          ') GOTO 40
          READ(JUNK(N),30,ERR=45) NAME(N), X1(N), Y(N), Z(N)
  30      FORMAT(A12,3F9.6)
          X = X + 1
  40   CONTINUE
  45   DO 50 N=1, X
          IF (Y(N) .EQ. 0.0000) GOTO 50
          WRITE(6,99) ' ATOM',N,NAME(N)(1:4),'R01','1',X1(N),Y(N),Z(N)
  50   CONTINUE
  99   FORMAT(A5,4X,I3,1X,A4,1X,A3,5X,A1,5X,F7.3,1X,F7.3,1X,F7.3)
       END
 a sample .com file:
 $ mat
 $ assign bac10.out sys$output
 $ run pdbfor
 C1           9.69488   2.44667  -0.33565
 C10          9.75925  -0.54929  -2.39149
 C11         10.42360   0.69546  -1.92134
 C12         11.67762   0.68335  -1.44625
 C13         12.20292   1.88993  -0.73423
 C14         11.11370   2.53232   0.13327
 C15          9.59960   1.99327  -1.83249
 C16          8.15503   1.88341  -2.24958
 C17         10.22275   3.12909  -2.72714
 C18         12.62008  -0.46922  -1.61531
 C19          6.79800  -0.80531   0.14685
 C2           8.81938   1.51381   0.56641
 C20          8.11897  -0.10148   3.12819
 C21          8.17820   3.44656   3.62179
 C22          9.35497   3.25012   4.31283
 C23          9.55583   3.82641   5.46785
 C24          8.56960   4.52187   6.11817
 C25          7.36193   4.72855   5.49377
 C26          7.15077   4.18950   4.20918
 C27          7.89495   2.85724   2.32609
 C28         11.62870  -0.59770   2.41494
 C29         12.81577   0.02700   3.11955
 C3           9.20562  -0.01024   0.67253
 C30          9.86225  -1.37695  -4.62256
 C31          9.40133  -1.04179  -6.04783
 C4           9.28030  -0.45526   2.11508
 C5           8.95327  -1.94486   2.55438
 C6           8.41510  -2.90193   1.53510
 C7           8.68032  -2.47646   0.14314
 C8           8.31210  -0.96358  -0.16042
 C9           8.45115  -0.91890  -1.70539
 H10         10.40815  -1.27361  -1.85717
 H13         12.52995   2.71200  -1.46352
 H14         11.34803   3.49404   0.20608
 H141        10.96950   2.09754   0.97116
 H16          8.13443   1.40767  -3.08006
 H161         7.61942   1.29595  -1.53756
 H162         7.90267   2.91589  -2.40136
 H17         11.34545   3.30505  -2.50872
 H171         9.38845   3.38884  -2.90360
 H172        10.34120   2.57887  -3.74272
 H18         13.56767   0.01583  -1.94725
 H181        11.84242  -0.72804  -2.31745
 H182        13.15568  -0.46736  -0.86133
 H191         6.26498  -1.47843  -0.50224
 H192         6.47098  -0.03538  -0.28012
 H2           7.93357   1.63204   0.29739
 H20          7.53187   0.33144   2.62719
 H201         8.25287   0.53626   3.71311
 H22         10.11202   2.91775   3.70447
 H25          6.57912   5.31601   5.95529
 H26          6.26240   4.24443   3.63907
 H27         10.34377  -2.16737   0.52075
 H3          10.02448  -0.01676   0.29616
 H5           9.62792  -2.31260   3.15657
 H6           7.59368  -2.88796   1.83619
 H61          8.62367  -3.69793   1.95959
 H7           7.97478  -3.01923  -0.31714
 O1           9.04597   3.72214  -0.17646
 O10          9.44510  -0.39009  -3.78344
 O100        10.45965  -2.33960  -4.29185
 O13         13.29215   1.56781   0.10489
 O2           8.93267   2.10406   1.85964
 O20          6.85465   2.98106   1.71403
 O4          10.51373   0.03259   2.73084
 O40         11.70337  -1.53429   1.66220
 O5           7.92843  -1.46819   3.47865
 O7          10.04508  -2.73621  -0.22829
 O9           7.55763  -1.35740  -2.35447
 the output from the sample:
 ATOM      1 C1   R01     1       9.695   2.447  -0.336
 ATOM      2 C10  R01     1       9.759  -0.549  -2.391
 ATOM      3 C11  R01     1      10.424   0.695  -1.921
 ATOM      4 C12  R01     1      11.678   0.683  -1.446
 ATOM      5 C13  R01     1      12.203   1.890  -0.734
 ATOM      6 C14  R01     1      11.114   2.532   0.133
 ATOM      7 C15  R01     1       9.600   1.993  -1.832
 ATOM      8 C16  R01     1       8.155   1.883  -2.250
 ATOM      9 C17  R01     1      10.223   3.129  -2.727
 ATOM     10 C18  R01     1      12.620  -0.469  -1.615
 ATOM     11 C19  R01     1       6.798  -0.805   0.147
 ATOM     12 C2   R01     1       8.819   1.514   0.566
 ATOM     13 C20  R01     1       8.119  -0.101   3.128
 ATOM     14 C21  R01     1       8.178   3.447   3.622
 ATOM     15 C22  R01     1       9.355   3.250   4.313
 ATOM     16 C23  R01     1       9.556   3.826   5.468
 ATOM     17 C24  R01     1       8.570   4.522   6.118
 ATOM     18 C25  R01     1       7.362   4.729   5.494
 ATOM     19 C26  R01     1       7.151   4.189   4.209
 ATOM     20 C27  R01     1       7.895   2.857   2.326
 ATOM     21 C28  R01     1      11.629  -0.598   2.415
 ATOM     22 C29  R01     1      12.816   0.027   3.119
 ATOM     23 C3   R01     1       9.206  -0.010   0.673
 ATOM     24 C30  R01     1       9.862  -1.377  -4.622
 ATOM     25 C31  R01     1       9.401  -1.042  -6.048
 ATOM     26 C4   R01     1       9.280  -0.455   2.115
 ATOM     27 C5   R01     1       8.953  -1.945   2.554
 ATOM     28 C6   R01     1       8.415  -2.902   1.535
 ATOM     29 C7   R01     1       8.680  -2.476   0.143
 ATOM     30 C8   R01     1       8.312  -0.964  -0.160
 ATOM     31 C9   R01     1       8.451  -0.919  -1.705
 ATOM     32 H10  R01     1      10.408  -1.274  -1.857
 ATOM     33 H13  R01     1      12.530   2.712  -1.464
 ATOM     34 H14  R01     1      11.348   3.494   0.206
 ATOM     35 H141 R01     1      10.969   2.098   0.971
 ATOM     36 H16  R01     1       8.134   1.408  -3.080
 ATOM     37 H161 R01     1       7.619   1.296  -1.538
 ATOM     38 H162 R01     1       7.903   2.916  -2.401
 ATOM     39 H17  R01     1      11.345   3.305  -2.509
 ATOM     40 H171 R01     1       9.388   3.389  -2.904
 ATOM     41 H172 R01     1      10.341   2.579  -3.743
 ATOM     42 H18  R01     1      13.568   0.016  -1.947
 ATOM     43 H181 R01     1      11.842  -0.728  -2.317
 ATOM     44 H182 R01     1      13.156  -0.467  -0.861
 ATOM     45 H191 R01     1       6.265  -1.478  -0.502
 ATOM     46 H192 R01     1       6.471  -0.035  -0.280
 ATOM     47 H2   R01     1       7.934   1.632   0.297
 ATOM     48 H20  R01     1       7.532   0.331   2.627
 ATOM     49 H201 R01     1       8.253   0.536   3.713
 ATOM     50 H22  R01     1      10.112   2.918   3.704
 ATOM     51 H25  R01     1       6.579   5.316   5.955
 ATOM     52 H26  R01     1       6.262   4.244   3.639
 ATOM     53 H27  R01     1      10.344  -2.167   0.521
 ATOM     54 H3   R01     1      10.024  -0.017   0.296
 ATOM     55 H5   R01     1       9.628  -2.313   3.157
 ATOM     56 H6   R01     1       7.594  -2.888   1.836
 ATOM     57 H61  R01     1       8.624  -3.698   1.959
 ATOM     58 H7   R01     1       7.975  -3.019  -0.317
 ATOM     59 O1   R01     1       9.046   3.722  -0.176
 ATOM     60 O10  R01     1       9.445  -0.390  -3.783
 ATOM     61 O100 R01     1      10.460  -2.340  -4.292
 ATOM     62 O13  R01     1      13.292   1.568   0.105
 ATOM     63 O2   R01     1       8.933   2.104   1.860
 ATOM     64 O20  R01     1       6.855   2.981   1.714
 ATOM     65 O4   R01     1      10.514   0.033   2.731
 ATOM     66 O40  R01     1      11.703  -1.534   1.662
 ATOM     67 O5   R01     1       7.928  -1.468   3.479
 ATOM     68 O7   R01     1      10.045  -2.736  -0.228
 ATOM     69 O9   R01     1       7.558  -1.357  -2.354
 Hope this helps!
 Jeanne Bundens
 Bryn Mawr College
 jbundens@cc.brynmawr.edu