CCL: Calculations on a DRBL cluster
- From: Alexander Martins Silva <alex.msilva..uol.com.br>
- Subject: CCL: Calculations on a DRBL cluster
- Date: Sun, 29 Jan 2006 00:30:55 +0000
Sent to CCL by: Alexander Martins Silva [alex.msilva#%#uol.com.br]
Hi CCLers,
I have installed a small Linux cluster with the DRBL
program (drbl.sourceforge.net). I can run small parallel jobs with
MPI and even short GAMESS calculations. However, parallel jobs stop
without any error message when they require several hours of
calculation. Even when I start serial jobs on each node, the failure is
the same, and the same job can stop at different points. What is
happening? How can I solve this? I'm using a Mandrake 10/Athlon XP 2600
server machine with RAID 1, 10 P4 clients, and a Planet
24-port gigabit switch.
Any suggestion or advice is welcome.
Thanks,
Alexander.