Date: Thu, 20 Jul 1995 11:07:15 -0500 (CDT)
From: Reece Kimball Hart 
To: chemistry@ccl.net
Subject: ANNOUNCE: automated PDB retrieval

This is an announcement for getpdb, a ksh script which automates
incremental mirroring of the Protein Data Bank.  It requires only
standard Unix utilities (ftp, sed, nawk, cut, zcat).  The script
contains much more detail about features, usage, and requirements.
Comments are welcome.

getpdb may be obtained from:
http://dasher.wustl.edu/~reece/src/getpdb
ftp://dasher.wustl.edu/pub/getpdb/

-- Reece
Reece Kimball Hart                  | email: reece@dasher.wustl.edu
Biophysics & Biochemistry, Box 8231 | WWW:   http://dasher.wustl.edu/~reece/
Washington Univ. School of Medicine | Phone: (314) 362-4198 (lab)
660 South Euclid                    |                 -7183 (fax)
St. Louis, Missouri  63110    (USA) | PGP public key available by finger & WWW


------------------  ORIGINAL README --------------------


     #################################################################
     ##                                                             ##
     ##  GETPDB  --  Update a Local PDB Database via Anonymous ftp  ##
     ##                                                             ##
     #################################################################

     GETPDB is a "simple" Unix shell script that updates your local copy
     of the PDB database to match the current copy on the Brookhaven PDB
     anonymous ftp server. It works by checking the size and timestamp of
     all current PDB files on the Brookhaven server (as stored at BNL in
     the file "all_entries/contents.lis") against the same info for your
     local copy of the database (as stored locally in "files.list").
     Files are retrieved into or deleted from your local database to cause
     it to match the official Brookhaven version. A sample "files.list"
     index file from July 1995 is included in this directory.
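
     The comparison described above can be sketched as follows. This is
     an illustration in Python, not the ksh code in GETPDB itself. The
     layouts of "contents.lis" and "files.list" are not described in this
     README, so each index is assumed here to be already parsed into a
     mapping from entry name to (size, timestamp); the name plan_update
     and the sample entries are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed representation: entry name -> (size in bytes, timestamp string).
# The real on-disk formats of the two index files are not specified here.
Index = Dict[str, Tuple[int, str]]


def plan_update(remote: Index, local: Index) -> Tuple[List[str], List[str]]:
    """Return (to_fetch, to_delete) so the local copy ends up matching remote."""
    # Fetch anything that is new, or whose size or timestamp differs.
    to_fetch = sorted(
        name for name, info in remote.items() if local.get(name) != info
    )
    # Delete anything the local index lists that the server no longer has.
    to_delete = sorted(name for name in local if name not in remote)
    return to_fetch, to_delete


# Made-up sample data, for illustration only.
remote = {
    "1abc": (1000, "1995-07-01"),
    "2xyz": (2000, "1995-07-10"),
    "3new": (500, "1995-07-15"),
}
local = {
    "1abc": (1000, "1995-07-01"),
    "2xyz": (1900, "1995-06-01"),
    "4old": (700, "1994-01-01"),
}

print(plan_update(remote, local))  # (['2xyz', '3new'], ['4old'])
```

     An entry is refetched whenever its size or timestamp differs, so a
     file replaced on the server under the same name is still picked up.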

     In addition, GETPDB strips from the distribution files any characters
     in columns 71-80 and any blank spaces from column 70 back to the
     last nonblank character of each line. This results in significant
     space savings, approaching 20% for some files. The original PDB
     files (pdbxxxx.ent) are deleted and the stripped files are retained
     in the local database (as xxxx.pdb).
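
     The stripping and renaming rules above can be illustrated with a
     short Python sketch. It is not the sed/nawk code used by GETPDB; the
     function names strip_pdb_line and local_name are hypothetical, and
     the sample record is made up.

```python
def strip_pdb_line(line: str) -> str:
    """Drop columns 71-80 (and anything after), then trailing blanks.

    Columns in the text are 1-based; the slice [:70] keeps columns 1-70.
    """
    return line.rstrip("\r\n")[:70].rstrip(" ")


def local_name(dist_name: str) -> str:
    """Map a distribution name such as 'pdb1abc.ent' to '1abc.pdb'."""
    if not (dist_name.startswith("pdb") and dist_name.endswith(".ent")):
        raise ValueError("unexpected distribution file name: " + dist_name)
    # Remove the leading 'pdb' and the trailing '.ent'.
    return dist_name[3:-4] + ".pdb"


# An 80-column-style record: text, padding to column 70, then a trailer.
record = "HEADER    TEST".ljust(70) + "1ABC   1\n"

print(repr(strip_pdb_line(record)))  # 'HEADER    TEST'
print(local_name("pdb1abc.ent"))     # 1abc.pdb
```

     Lines that are mostly padding shrink the most, which is consistent
     with savings that vary from file to file.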
      
     The GETPDB script is primarily intended for periodic updating of a
     local copy of the database. The local "files.list" will also be
     updated when GETPDB is run. The first time GETPDB is used at a site,
     it will try to download the entire database, and create a "files.list"
     index. This initial operation involves downloading thousands of files
     totaling well over 1 GB of disk space, and should only be performed
     on evenings or weekends.

     Local or modified versions of PDB files can be added to your master
     PDB directory. As long as these files are not present in your copy
     of "files.list", they will remain untouched by GETPDB.
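
     One way to picture why such files are left alone: deletion
     candidates are drawn only from names recorded in the index, never
     from a raw directory listing. The Python sketch below illustrates
     that rule under the assumption that the index records local file
     names; the function safe_to_delete and the file names are
     hypothetical and are not taken from the GETPDB script.

```python
from typing import Iterable, List


def safe_to_delete(
    directory_files: Iterable[str],
    indexed_files: Iterable[str],
    remote_files: Iterable[str],
) -> List[str]:
    """Return files that are tracked in the index but gone from the server."""
    managed = set(indexed_files)
    still_on_server = set(remote_files)
    # A file absent from the index is never considered, whatever its name.
    return sorted(
        f for f in directory_files
        if f in managed and f not in still_on_server
    )


# Made-up sample data: 'mymodel.pdb' is a locally added file.
directory = ["1abc.pdb", "4old.pdb", "mymodel.pdb"]
indexed = ["1abc.pdb", "4old.pdb"]
on_server = ["1abc.pdb"]

print(safe_to_delete(directory, indexed, on_server))  # ['4old.pdb']
```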

     The GETPDB script accepts a number of options on the command line
     when it is invoked. These options and other features are described
     in the comments found at the top and bottom of the script itself.
     The script has been tested under Digital UNIX 3.2 and SGI IRIX 5.3,
     but should run unmodified or with minor changes on other systems.
Modified: Sat Jul 22 16:00:00 1995 GMT