Date: Thu, 20 Jul 1995 11:07:15 -0500 (CDT)
From: Reece Kimball Hart
To: chemistry@ccl.net
Subject: ANNOUNCE: automated PDB retrieval
This is an announcement for getpdb, a ksh script which automates
incremental mirroring of the Protein Data Bank. It requires only
standard Unix utilities (ftp, sed, nawk, cut, zcat). The script
contains much more detail about features, usage, and requirements.
Comments are welcome.
getpdb may be obtained from:
http://dasher.wustl.edu/~reece/src/getpdb
ftp://dasher.wustl.edu/pub/getpdb/
-- Reece
Reece Kimball Hart                  | email: reece@dasher.wustl.edu
Biophysics & Biochemistry, Box 8231 | WWW:   http://dasher.wustl.edu/~reece/
Washington Univ. School of Medicine | Phone: (314) 362-4198 (lab)
660 South Euclid                    |                 -7183 (fax)
St. Louis, Missouri 63110 (USA)     | PGP public key available by finger & WWW
------------------ ORIGINAL README --------------------
#################################################################
##                                                             ##
##   GETPDB -- Update a Local PDB Database via Anonymous ftp   ##
##                                                             ##
#################################################################
GETPDB is a "simple" Unix shell script that updates your local copy
of the PDB database to match the current copy on the Brookhaven PDB
anonymous ftp server. It works by checking the size and timestamp of
all current PDB files on the Brookhaven server (as stored at BNL in
the file "all_entries/contents.lis") against the same info for your
local copy of the database (as stored locally in "files.list").
Files are retrieved into or deleted from your local database so that
it matches the official Brookhaven version. A sample "files.list"
index file from July 1995 is included in this directory.
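The comparison described above can be sketched in a few lines of
shell. This is only an illustration, not the code used inside GETPDB.
It assumes, purely for the example, that both index files hold one
entry per line in the form "name size timestamp" with no embedded
blanks; the real layout of "contents.lis" and "files.list" may differ.
The function name plan_update is hypothetical.

```shell
# Sketch only; NOT the actual GETPDB implementation.
# Assumed (hypothetical) index format: "name size timestamp" per line.
#
# plan_update REMOTE_LIST LOCAL_LIST
#   prints "GET name" for entries that are new or changed on the server
#   prints "DEL name" for entries in the local index that the server
#   no longer lists
plan_update() {
    awk '
        # First file: remember size and timestamp of each server entry.
        FILENAME == ARGV[1] { remote[$1] = $2 " " $3; next }

        # Second file: compare each local entry with the server copy.
        {
            have[$1] = 1
            if (!($1 in remote))
                print "DEL " $1
            else if (remote[$1] != $2 " " $3)
                print "GET " $1
        }

        # Server entries never seen in the local index must be fetched.
        END {
            for (name in remote)
                if (!(name in have))
                    print "GET " name
        }
    ' "$1" "$2" | sort
}
```

On a first run, when no local index exists yet, passing /dev/null as
the second argument makes every server entry come out as "GET". Only
names listed in the local index can ever be reported as "DEL".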
In addition, GETPDB strips from the distribution files any characters
in columns 71-80 and any blank spaces from column 70 back to the
last nonblank character of each line. This results in a significant
space savings, approaching 20% for some files. The original PDB
files (pdbxxxx.ent) are deleted and the stripped files are retained
in the local database (as xxxx.pdb).
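The stripping step can be approximated with two of the standard
utilities the script relies on (cut and sed). The sketch below is an
assumption about how such a filter might look, not an excerpt from
GETPDB; the function names strip_pdb and stripped_name are
hypothetical.

```shell
# Sketch only; NOT the actual GETPDB implementation.

# strip_pdb INFILE OUTFILE
#   keep columns 1-70 of every line, then drop trailing blanks
strip_pdb() {
    cut -c1-70 "$1" | sed 's/ *$//' > "$2"
}

# stripped_name PATH
#   map a distribution name (pdbxxxx.ent) to the local name (xxxx.pdb)
stripped_name() {
    base=${1##*/}       # remove any leading directory
    id=${base#pdb}      # remove the "pdb" prefix
    id=${id%.ent}       # remove the ".ent" suffix
    printf '%s.pdb\n' "$id"
}
```

A typical use would be to call strip_pdb on each retrieved file,
writing to the name given by stripped_name, and to remove the original
only after the stripped copy has been written successfully.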
The GETPDB script is primarily intended for periodic updating of a
local copy of the database. The local "files.list" will also be
updated when GETPDB is run. The first time GETPDB is used at a site,
it will try to download the entire database, and create a "files.list"
index. This initial operation involves downloading thousands of files
totaling well over 1 GB of disk space, and should only be performed
on evenings or weekends.
Local or modified versions of PDB files can be added to your master
PDB directory. As long as these files are not present in your copy
of "files.list", they will remain untouched by GETPDB.
The GETPDB script accepts a number of options on the command line
when it is invoked. These options and other features are described
in the comments found at the top and bottom of the script itself.
The script has been tested under Digital UNIX 3.2 and SGI IRIX 5.3,
but should run unmodified or with minor changes on other systems.