From chemistry-request@server.ccl.net Sun Jan 28 10:50:29 2001
Sender: laidig@pg.com
Message-ID: <3A743F28.8FE4DD21@pg.com>
Date: Sun, 28 Jan 2001 10:47:53 -0500
From: Bill Laidig <laidig@pg.com>
X-Mailer: Mozilla 4.61C-SGI [en] (X11; I; IRIX64 6.5 IP30)
X-Accept-Language: en
MIME-Version: 1.0
To: CHEMISTRY@ccl.net, stein.pandora@pg.com
Subject: Queuing Software
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

All,

We have been looking into and testing queuing software for a while and
found nothing very satisfactory.  All have drawbacks such as price, lack
of adequate cross-platform support or features.  I am interested in what
queuing software CCL users USE on their systems and how they like this
software.  I will summarize responses to the list.  By the way, we are
looking for a queuing system that:

a) Works across both SGI Irix and Linux clusters (NQS works well on
Linux, but we are having problems getting it to work well on SGIs, if I
remember correctly what our systems manager mentioned).
b) Is not extremely expensive (LSF, for example, is quite nice, but
would be rather expensive to implement across our whole system)
c) Is full-featured in terms of queue control (priority, job limits,
processor limits, time limits, ability to suspend and track jobs...)
d) Can treat a group of machines as a single queue (for our cluster)

Thanks in advance, Bill

--
************************************************************************
*    "Like jewels in a crown, the precious stones glittered in the     *
*     queen's round metal hat." - Jack Handey                          *
*                                                                      *
*     Bill Laidig                                                      *
*     The Procter & Gamble Co.             tel 513-627-2857 fax - 1233 *
*     Miami Valley Laboratories            laidig@pg.com (preferred)   *
*     P.O. Box 538707                      laidig.wd@pg.com            *
*     Cincinnati, OH 45253-8707                                        *
************************************************************************





From chemistry-request@server.ccl.net Sun Jan 28 19:10:12 2001
Date: Mon, 29 Jan 2001 00:10:10 +0000 (GMT)
From: Ben Webb <ben@bellatrix.pcl.ox.ac.uk>
To: Bill Laidig <laidig@pg.com>
cc: CHEMISTRY@ccl.net, stein.pandora@pg.com
Subject: Re: CCL:Queuing Software
In-Reply-To: <3A743F28.8FE4DD21@pg.com>
Message-ID: <Pine.LNX.4.21.0101282351560.17915-100000@bellatrix.pcl.ox.ac.uk>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

On Sun, 28 Jan 2001, Bill Laidig wrote:

> We have been looking into and testing queuing software for a while and
> found nothing very satisfactory.  All have drawbacks such as price, lack
> of adequate cross-platform support or features.  I am interested in what
> queuing software CCL users USE on their systems and how they like this
> software.

	We use Open PBS (www.openpbs.com) here and it works well for our
needs (distribution of single-processor and MPI jobs across a network of
Linux-based workstations and compute nodes). It is available under a
fairly liberal licence, and is free for both commercial and non-commercial
use provided you are registered with its owners. The "P" in PBS stands for
"portable" and as such it works on a variety of platforms (although we've
only tried it on Intel and Alpha Linux boxes so far here). It is fairly
feature-rich by default, but since the source code is available, you can
augment its features (or "bugs" if you like ;) with your own. If that's
still not enough, you can get the "full" version, PBS Pro, but that costs
money...

> a) Works across both SGI Irix and Linux clusters (NQS works well on
> Linux, but we are having problems getting it to work well on SGIs, if I
> remember correctly what our systems manager mentioned).

	If you've used NQS in the past, PBS should be relatively easy to
pick up, as it's based (at least in part) on NQS. There are even utilities
for converting NQS job scripts to PBS format.

> c) Is full-featured in terms of queue control (priority, job limits,
> processor limits, time limits, ability to suspend and track jobs...)

	This is all possible by default, I believe, with the exception of
processor limits (you can restrict jobs by nodes - machines - but not by
processors). Patches are available for this functionality, however (for
instance, from our own site, http://bellatrix.pcl.ox.ac.uk/~ben/pbs/). PBS
is flexible in that this kind of thing is generally determined by a
dedicated scheduler program, which you can either provide from scratch (in
one of three different languages) or can derive from the example
"FIFO" scheduler.
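
	To make the node-versus-processor distinction concrete, here is a
minimal sketch of a strict FIFO scheduling pass, written in Python purely
for illustration. It is not the PBS "FIFO" scheduler and does not use any
PBS API; the names Job, fifo_pass and the hosts n1/n2 are assumptions made
up for this example. With by_processor=False a job claims a whole machine,
which mirrors restricting jobs by nodes; with by_processor=True a job only
claims the processors it asks for, which is the kind of behaviour the
patches mentioned above are meant to add.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cpus: int  # processors the job wants on one machine


def fifo_pass(queue, total_cpus, by_processor):
    """Start jobs in submission order; stop at the first job that does not fit.

    total_cpus maps a (hypothetical) host name to its processor count.
    Returns a list of (job name, host) pairs for the jobs that were started.
    """
    free = dict(total_cpus)  # idle processors per host
    started = []
    while queue:
        job = queue[0]
        host = None
        for name, idle in free.items():
            if by_processor:
                fits = idle >= job.cpus
            else:
                # Node-level allocation: the machine must be completely idle.
                fits = idle == total_cpus[name] and total_cpus[name] >= job.cpus
            if fits:
                host = name
                break
        if host is None:
            break  # strict FIFO: the head of the queue blocks the jobs behind it
        free[host] -= job.cpus if by_processor else total_cpus[host]
        started.append((job.name, host))
        queue.popleft()
    return started


def demo_queue():
    # Four jobs in submission order: three single-processor, one dual-processor.
    return deque([Job("a", 1), Job("b", 1), Job("c", 2), Job("d", 1)])


cluster = {"n1": 2, "n2": 2}  # two dual-processor machines (assumed)
print(fifo_pass(demo_queue(), cluster, by_processor=False))
# [('a', 'n1'), ('b', 'n2')]
print(fifo_pass(demo_queue(), cluster, by_processor=True))
# [('a', 'n1'), ('b', 'n1'), ('c', 'n2')]
```

	On two dual-processor machines, node-level allocation starts only
two of the four jobs, while processor-level allocation starts three. A real
scheduler would also have to weigh priorities, time limits and per-user job
limits, which this sketch leaves out.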

> d) Can treat a group of machines as a single queue (for our cluster)

	This is the default behaviour. More complex behaviour can be
obtained by tinkering with the scheduler. Most "unusual" requirements have
already been addressed by other researchers, however, and you can usually
find patches that do something similar to what you want on the mailing
list, which is generally quite responsive.
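
	As a rough picture of what "a group of machines as a single queue"
means, the Python sketch below feeds one shared queue to several workers,
each standing in for a machine. It is only a toy model under stated
assumptions: the function run_cluster and the host names are invented for
this example, the "machines" are threads in one process, and nothing here
reflects how PBS itself dispatches jobs.

```python
import queue
import threading


def run_cluster(jobs, machines):
    """Feed one shared queue to several machines; return {job: machine}."""
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    placement = {}
    lock = threading.Lock()

    def worker(machine):
        # Each machine keeps taking the next job until the queue is empty.
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return
            with lock:
                placement[job] = machine

    threads = [threading.Thread(target=worker, args=(m,)) for m in machines]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return placement


jobs = [f"job{i}" for i in range(10)]
placement = run_cluster(jobs, ["node1", "node2", "node3"])
print(sorted(placement.items()))
```

	Users submit to one place and never name a machine; which machine
ends up running a given job depends on which one is free first, so the
mapping can differ from run to run.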

	Ben
-- 
ben@bellatrix.pcl.ox.ac.uk           http://bellatrix.pcl.ox.ac.uk/~ben/
"God does not play dice with the universe."
	- Albert Einstein



