From: jobs at ccl.net (do not send your application there!!!)
To: jobs at ccl.net
Date: Fri Oct 11 22:24:21 2024
Subject: 24.10.11 System Administrator, National Heart, Lung, and Blood Institute, Bethesda, USA
System Administrator, National Heart, Lung, and Blood Institute, Bethesda, USA
The Laboratory of Computational Biophysics (LCB) is a group of researchers
who employ computational simulation methods to investigate problems in
biophysics and chemistry using the Linux-based LoBoS high-performance
computing (HPC) cluster within the National Heart, Lung, and Blood Institute
(NHLBI) at the National Institutes of Health
( https://www.lobos.nih.gov/LoBoS.shtml ).
LoBoS consists of several hundred
CPU/GPU computational nodes, three tiers of storage (home directories,
scratch space, and archive), associated network infrastructure (both
Infiniband and Ethernet), and Linux desktops for users.
This position is for the day-to-day management of the LoBoS HPC compute
nodes, storage systems, and desktops. The position involves working as part
of a small team (at least two people) whose primary responsibilities are to
keep the cluster running in good order and to ensure that the cluster follows
security best practices as determined by the NIH and Department of Health and
Human Services. It also involves maintaining the usability of the LoBoS
cluster via yearly purchase and installation of hardware to replace aging
components.
ABOUT THE POSITION
Ensure that the various components of the LoBoS cluster stay in good
working order, including network configuration, firewall management (Palo Alto),
file system management (ZFS, VAST), security, batch queuing systems (SLURM),
database administration, distributed computing, file transfer services, web
servers, and electronic mailing lists.
May occasionally require work outside normal 9-5 hours to address
emergency situations with the cluster (e.g. significant numbers of down
nodes or storage outages) or cybersecurity incidents (FISMA).
Ensure that the LoBoS cluster has sufficient capabilities to run the
scientific software needed by the LCB scientists. Evaluate the existing
system to determine when updates/upgrades to hardware and/or software are
necessary. Manage the budget used to procure new
hardware/software for LoBoS. Oversee configuration and installation of
virtual and physical servers and manage upgrades to existing hardware.
Ensure that patches, security updates, and configuration changes to
software systems are applied to enhance reliability and to meet security
needs. Collaborate with OCIO, CIT, and NHLBI security teams to ensure
adherence to compliance policies.
Assist in maintaining the LoBoS Assessment & Authorization package
based on National Institute of Standards and Technology SP 800-53 security
controls under guidance from NHLBI's Information System Security Officers.
Serve as a technical resource for HPC, LCB, NHLBI, and other NIH
personnel in areas such as the Linux operating system, networking, database
system administration, and distributed computing. May serve on technical
evaluation panels for institute-wide initiatives.
Technology tracking: Stay informed regarding new developments in
hardware/software, and evaluate their potential usability for LoBoS/LCB.
Participate in conferences and meetings of professional groups concerned
with the application of HPC, AI/machine learning, and other emerging computer
technologies.
Prepare software documentation and technical reports related to
assigned projects.
ABOUT YOUR BACKGROUND
5+ years of experience in Linux HPC systems administration is
preferred. However, less experienced candidates with outstanding
qualifications will also be considered.
Comprehensive knowledge of shell scripting. Broad knowledge of
systems administration tools (e.g. Puppet, Ansible) along with a
detailed knowledge of tools used in a particular area such as file system
management, usage accounting, mail configuration, database system
administration, file transfer, or security.
Experience with government computer security rules and standards is
desirable.
Extensive knowledge of at least two high-level computer languages
such as C, C++, FORTRAN, Ruby, Perl, or Python is desirable.
Experience implementing and managing SLURM batch queuing software
is preferred.
Solid interpersonal, leadership, and critical thinking skills.
Excellent written and oral communication skills.
ADDITIONAL INFORMATION
Location: 9000 Rockville Pike, Bethesda, Maryland, which is
accessible via bus/bicycle/Metro (Red Line: Medical Center).
Some travel to professional meetings (e.g. the Supercomputing
Conference) may occasionally be required.
Some remote work is acceptable (up to 3 days per week).
Employment type: full-time government contractor.
Salary range: $100,000 to $180,000/year, commensurate with
education and experience.
A selection of health and wellness benefits will be offered.
HOW TO APPLY
To be considered, please submit your resume and cover letter to Dr. Daniel R.
Roe at daniel.roe**nih.gov with the subject heading of System Administrator.
Appointees must be U.S. citizens or Permanent Resident Card holders.
Applications should be submitted by November 4, 2024.
We are an equal opportunity employer, and we actively prohibit discrimination
and harassment of any kind. We strongly encourage people of color, LGBTQ+
people, immigrants, women, and people who are differently-abled to apply.
NOTE THAT E-MAIL ADDRESSES HAVE BEEN MODIFIED!!!
All @ signs were changed to ** to fight spam. Before you send e-mail, you
need to change ** to @
For example: change joe**big123comp.com to joe@big123comp.com
Please let your prospective employer know that you learned
about the job from the Computational Chemistry List Job Listing at
https://server.ccl.net/jobs.